Chapter 7 Global and local spatial autocorrelation
This session we begin to …
library(sf)
library(tmap)
library(dplyr)
library(sp)
library(spdep)
7.1 Get data
So, let’s start by getting some data. We are going to take some of the data from past weeks. In getting the data ready you will have one more opportunity to practice how to read data into R but also how to perform some basic spatial checks, transformations and operations. It may seem like you have already done some of this stuff. But that’s the point: to force you to practice. The more you do this stuff, the easier it will be and -trust us- eventually things will click and become second nature. First let’s get the LSOA boundary data.
shp_name <- "data/BoundaryData/england_lsoa_2011.shp"
manchester_lsoa <- st_read(shp_name)
## Reading layer `england_lsoa_2011' from data source `/Users/reka/Dropbox (The University of Manchester)/crimemapping_textbook_bookdown/data/BoundaryData/england_lsoa_2011.shp' using driver `ESRI Shapefile'
## Simple feature collection with 282 features and 3 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: 378833.2 ymin: 382620.6 xmax: 390350.2 ymax: 405357.1
## epsg (SRID): NA
## proj4string: +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +datum=OSGB36 +units=m +no_defs
Now check the coordinate system.
st_crs(manchester_lsoa)
## Coordinate Reference System:
## No EPSG code
## proj4string: "+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +datum=OSGB36 +units=m +no_defs"
There is no EPSG code assigned, but notice the datum is given for BNG. Let’s address this.
lsoa_WGS84 <- st_transform(manchester_lsoa, 4326)
st_crs(lsoa_WGS84)
## Coordinate Reference System:
## EPSG: 4326
## proj4string: "+proj=longlat +datum=WGS84 +no_defs"
plot(st_geometry(lsoa_WGS84))

rm(manchester_lsoa)
Let’s add the burglary data from Greater Manchester. We have practiced this code in previous sessions so we won’t go over it on detail again, but try to remember and understand what each line of code rather than blindly cut and paste. If you don’t understand what each of these lines of codes is doing call us out.
#Read into R
crimes <- read.csv("https://raw.githubusercontent.com/jjmedinaariza/CrimeMapping/master/gmpcrime.csv")
#Filter out to select burglary
burglary <- filter(crimes, crime_type == "Burglary")
#Transform into spatial object
burglary_spatial = st_as_sf(burglary, coords = c("long", "lat"),
crs = 4326, agr = "constant")
#Remove redundant non spatial burglary object
rm(burglary)
rm(crimes)
#Select burglaries that intersect with the Manchester city LSOA map.
bur_mc <- st_intersects(lsoa_WGS84, burglary_spatial)
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
bur_mc <- burglary_spatial[unlist(bur_mc),]
#Check results
plot(st_geometry(bur_mc))

#Remove redundant objects
rm(burglary_spatial)
We have the burglaries, let’s now count how many burglaries there is within each LSOA polygon. This is a point in polygon operation that we cover a couple of weeks ago. If the code or the notion does not make much sense to you make sure you review the relevant session from week 4.
#Point in polygon spatial operation
burglaries_per_lsoa <- bur_mc %>%
st_join(lsoa_WGS84, ., left = FALSE) %>%
count(code)
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
#Let's rename the column with the count of burglaries (n) into something more meaningful
burglaries_per_lsoa <- rename(burglaries_per_lsoa, burglary = n)
#Plot with tmap
tm_shape(burglaries_per_lsoa) +
tm_fill("burglary", style = "quantile", palette = "Reds") +
tm_borders(alpha = 0.1) +
tm_layout(main.title = "Burglary counts", main.title.size = 0.7 ,
legend.position = c("right", "bottom"), legend.title.size = 0.8)

Do you see any patterns? Are burglaries randomly spread around the map? Or would you say that areas that are closer to each other tend to be more alike? Is there evidence of clustering? Do burglaries seem to appear in certain pockets of the map? In this session we are going to discuss ways in which you can quantify the answer to this question. We will discuss measures of global spatial autocorrelation, which essentially aim to answer the degree to which areas that are near each other tend to be more alike. We say global because we are interested in the degree of clustering not on the location of the clusters. Later we will also cover techniques to identify local clusters of autocorrelation, but for now we will focus in quantifying whether areas are (on average) alike their neighbours.
7.2 What is a neighbour?
Previously I asked whether areas are alike their neighbours or to areas that are close. But what is a neighbour? Or what do we mean by close? How can one define a set of neighbours for each area? If we want to know if what we measure in a particular area is similar to what happens on its neighbouring areas, we need to establish what we mean by a neighbour.
There are various ways of defining a neighbour. We can say that two areas are neighbours if they share boundaries, if they are next to each other. In this case we talk of neighbours by contiguity. By contiguous you can, at the same time, mean all areas that share common boundaries (what we call contiguity using the rook criteria, like in chess) or areas that share common boundaries and common corners, that is, that have any point in common (and we call this contiguity using the queen criteria).
When defining neighbours by contiguity we may also specify the order of contiguity. First order contiguity means that we are focusing on areas immediately contiguous. Second order means that we consider neighbours only those areas that are immediately contiguous to our first order neighbours (only the yellow areas in the figure below) and you could go on and on. Look at the graphic below for clarification:
Figure 1
Alternatively we may define neighbours by distance. You can consider neighbours those areas that distant-wise are close to each other (regardless of whether boundaries are shared). In other words, areas will be defined as neighbours if they are within a specified radius.
In sum, adjacency is an important concept in some spatial analysis. In some cases objects are considered ajacent when they “touch”, e.g. neighboring countries. Contiguity measures tend to be more common when studying areas. It can also be based on distance. This is the most common approach when analyzing point data, but can also be relevant when studying areas.
You will come across the term spatial weight matrix at some point or, using mathematical notation, W. Essentially the spatial weight matrix is a n by n matrix with ones and zeroes (in the case of contiguity based definitions) identifying if any two observations are neighbours. So you can think of the spatial weight matrix as the new data table that we are constructing with our definition of neighbours, whichever this is.
How do you build such a matrix with R? Well, let’s turn to that. But to make things a bit simpler, let’s focus not on the whole of Manchester, but just in the LSOAs within the city centre. We will use familiar code to clip the spatial object with the counts of burglaries to only those that intersect with the City Centre ward. Again, we have covered this code elsewhere, so we won’t explain here in detail. But don’t just cut and paste, if there is anything in this code you don’t fully understand you are expected to ask us.
#Read a geojson file with Manchester wards
manchester_ward <- st_read("https://raw.githubusercontent.com/RUMgroup/Spatial-data-in-R/master/rumgroup/data/wards.geojson")
## Reading layer `OGRGeoJSON' from data source `https://raw.githubusercontent.com/RUMgroup/Spatial-data-in-R/master/rumgroup/data/wards.geojson' using driver `GeoJSON'
## Simple feature collection with 215 features and 12 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: 351664 ymin: 381168.6 xmax: 406087.5 ymax: 421039.8
## epsg (SRID): 27700
## proj4string: +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +towgs84=446.448,-125.157,542.06,0.15,0.247,0.842,-20.489 +units=m +no_defs
#Create a new object that only has the city centre ward
df1 <- manchester_ward %>%
filter(wd16nm == "City Centre")
#Change coordinate systems
cc_ward <- st_transform(df1, 4326)
#Check if they match those of the imd_gm object
st_crs(cc_ward) == st_crs(burglaries_per_lsoa)
## [1] TRUE
#Get rid of objects we no longer need
rm(manchester_ward)
rm(df1)
#Intersect
cc_intersects <- st_intersects(cc_ward, burglaries_per_lsoa)
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
cc_burglary <- burglaries_per_lsoa[unlist(cc_intersects),]
#Plot with tmap
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(cc_burglary) +
tm_fill("burglary", style = "quantile", palette = "Reds", id="code") +
tm_borders() +
tm_layout(main.title = "Burglary counts", main.title.size = 0.7 ,
legend.position = c("right", "top"), legend.title.size = 0.8)
## legend.postion is used for plot mode. Use view.legend.position in tm_view to set the legend position in view mode.
So now we have a new spatial object cc_burglary with the 23 LSOA units that compose the City Centre of Manchester. By focusing in a smaller subset of areas we can understand perhaps a bit better what comes next. But again we carry on. Do you perceive here some degree of spatial autocorrelation?
7.2.1 Homework 1
The id argument in the tm_fill ensures that when you click over any of the areas you get not only the count of burglaries in that LSOA (the quantity we are mapping) gets displayed within a bubble, but you also get to see the code that identifies that LSOA.
Move your cursor over the LSOA covering Beswick. You will see this area had 58 burglaries in 2017 and that the LSOA identifier is E01033688. Using the rook criteria identify the first order neighbors of this LSOA. List their identifiers. Are things different if we use the queen criteria? If so, how does it change? Now. Think and think hard about what the lecture by Luc Anseling discussed. Have you identified all the neighbours of this area? (there are multiple ways of answering this question, just make sure you reason your answer).
7.3 Creating a list of neighbours
It would be very, very tedious having to identify the neighbours of all the areas in our study area in the way we have done in homework 1. That’s why we love computers. We can automate tedious work so that they do it and we have more time to do fun stuff. We can use code to get the computer to establish what areas are next to each other (if we are using a contiguity based definition of being a neighbour).
We have already discussed how sf is new kid in town and although package developers are quickly trying to adapt their packages to work with sf, many of the existing spatial packages still expect sp objects. So to illustrate the concepts in this week session we are first going to turn our sf object into sp.
#Before we do that I want to turn the factor identifiying the LSOAs into a character vector (since I will need this later on)
cc_burglary$lsoa_code <- as.character(cc_burglary$code)
#We coerce the sf object into a new sp object that we are calling bur_ccsp (remember names are arbitrary)
bur_ccsp <- as(cc_burglary, "Spatial")
In order to identify neighbours we will use the poly2nb() function from the spdep package that we loaded at the beginning of our session. The spdep package provides basic functions for building neighbour lists and spatial weights, tests for spatial autocorrelation for areal data like Moran’s I, and functions for fitting spatial regression models.
This function builds a neigbours list based on regions with contiguous boundaries. If you look at the documentation you will see that you can pass a “queen” argument that takes TRUE or FALSE as options.If you do not specify this argument the default is set to true, that is, if you don’t specify queen = FALSE this function will return a list of first order neighbours using the Queen criteria.
w <- poly2nb(bur_ccsp, row.names=bur_ccsp$lsoa_code)
class(w)
## [1] "nb"
This has created a nb, neighbour list object. We can get some idea of what’s there if we ask for a summary.
summary(w)
## Neighbour list object:
## Number of regions: 23
## Number of nonzero links: 100
## Percentage nonzero weights: 18.90359
## Average number of links: 4.347826
## Link number distribution:
##
## 2 3 4 5 6 7 8
## 3 4 6 5 3 1 1
## 3 least connected regions:
## E01033673 E01033674 E01033684 with 2 links
## 1 most connected region:
## E01033658 with 8 links
This is basically telling us that using this criteria each LSOA polygon has an average of 4.3 neighbours (when we just focus on the city centre) and that all areas have some neighbours (there is no islands). The link number distribution gives you the number of links (neighbours) per area. So here we have 3 polygons with 2 neighbours, 3 with 3, 6 with 4, and so on. The summary function here also identifies the areas sitting at both extreme of the distribution.
For more details we can look at the structure of w.
str(w)
## List of 23
## $ : int [1:4] 2 6 11 23
## $ : int [1:5] 1 3 4 6 8
## $ : int [1:5] 2 5 6 8 12
## $ : int [1:3] 2 8 21
## $ : int [1:7] 3 6 9 10 12 14 18
## $ : int [1:6] 1 2 3 5 10 11
## $ : int [1:3] 9 13 17
## $ : int [1:6] 2 3 4 12 18 21
## $ : int [1:8] 5 7 13 14 17 18 19 20
## $ : int [1:4] 5 6 11 14
## $ : int [1:5] 1 6 10 22 23
## $ : int [1:4] 3 5 8 18
## $ : int [1:3] 7 9 14
## $ : int [1:4] 5 9 10 13
## $ : int [1:4] 16 19 20 21
## $ : int [1:2] 15 19
## $ : int [1:2] 7 9
## $ : int [1:6] 5 8 9 12 20 21
## $ : int [1:4] 9 15 16 20
## $ : int [1:5] 9 15 18 19 21
## $ : int [1:5] 4 8 15 18 20
## $ : int [1:2] 11 23
## $ : int [1:3] 1 11 22
## - attr(*, "class")= chr "nb"
## - attr(*, "region.id")= chr [1:23] "E01005065" "E01005066" "E01005128" "E01005212" ...
## - attr(*, "call")= language poly2nb(pl = bur_ccsp, row.names = bur_ccsp$lsoa_code)
## - attr(*, "type")= chr "queen"
## - attr(*, "sym")= logi TRUE
We can graphically represent the links using the following code:
#We first plot the boundaries
plot(bur_ccsp, col='gray', border='blue', lwd=2)
#Then we use the coordinates function to obtain the coordinates of the polygon centroids
xy <- coordinates(bur_ccsp)
#Then we draw lines between the polygons centroids for neighbours that are listed as linked in w
plot(w, xy, col='red', lwd=2, add=TRUE)

7.4 Generating the weight matrix
We can transform w into a spatial weights matrix. A spatial weights matrix reflects the intensity of the geographic relationship between observations. For this we use the spdep function nb2mat().
wm <- nb2mat(w, style='B')
wm
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
## E01005065 0 1 0 0 0 1 0 0 0 0 1 0
## E01005066 1 0 1 1 0 1 0 1 0 0 0 0
## E01005128 0 1 0 0 1 1 0 1 0 0 0 1
## E01005212 0 1 0 0 0 0 0 1 0 0 0 0
## E01033653 0 0 1 0 0 1 0 0 1 1 0 1
## E01033654 1 1 1 0 1 0 0 0 0 1 1 0
## E01033655 0 0 0 0 0 0 0 0 1 0 0 0
## E01033656 0 1 1 1 0 0 0 0 0 0 0 1
## E01033658 0 0 0 0 1 0 1 0 0 0 0 0
## E01033659 0 0 0 0 1 1 0 0 0 0 1 0
## E01033661 1 0 0 0 0 1 0 0 0 1 0 0
## E01033662 0 0 1 0 1 0 0 1 0 0 0 0
## E01033664 0 0 0 0 0 0 1 0 1 0 0 0
## E01033667 0 0 0 0 1 0 0 0 1 1 0 0
## E01033672 0 0 0 0 0 0 0 0 0 0 0 0
## E01033673 0 0 0 0 0 0 0 0 0 0 0 0
## E01033674 0 0 0 0 0 0 1 0 1 0 0 0
## E01033677 0 0 0 0 1 0 0 1 1 0 0 1
## E01033681 0 0 0 0 0 0 0 0 1 0 0 0
## E01033682 0 0 0 0 0 0 0 0 1 0 0 0
## E01033683 0 0 0 1 0 0 0 1 0 0 0 0
## E01033684 0 0 0 0 0 0 0 0 0 0 1 0
## E01033688 1 0 0 0 0 0 0 0 0 0 1 0
## [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22]
## E01005065 0 0 0 0 0 0 0 0 0 0
## E01005066 0 0 0 0 0 0 0 0 0 0
## E01005128 0 0 0 0 0 0 0 0 0 0
## E01005212 0 0 0 0 0 0 0 0 1 0
## E01033653 0 1 0 0 0 1 0 0 0 0
## E01033654 0 0 0 0 0 0 0 0 0 0
## E01033655 1 0 0 0 1 0 0 0 0 0
## E01033656 0 0 0 0 0 1 0 0 1 0
## E01033658 1 1 0 0 1 1 1 1 0 0
## E01033659 0 1 0 0 0 0 0 0 0 0
## E01033661 0 0 0 0 0 0 0 0 0 1
## E01033662 0 0 0 0 0 1 0 0 0 0
## E01033664 0 1 0 0 0 0 0 0 0 0
## E01033667 1 0 0 0 0 0 0 0 0 0
## E01033672 0 0 0 1 0 0 1 1 1 0
## E01033673 0 0 1 0 0 0 1 0 0 0
## E01033674 0 0 0 0 0 0 0 0 0 0
## E01033677 0 0 0 0 0 0 0 1 1 0
## E01033681 0 0 1 1 0 0 0 1 0 0
## E01033682 0 0 1 0 0 1 1 0 1 0
## E01033683 0 0 1 0 0 1 0 1 0 0
## E01033684 0 0 0 0 0 0 0 0 0 0
## E01033688 0 0 0 0 0 0 0 0 0 1
## [,23]
## E01005065 1
## E01005066 0
## E01005128 0
## E01005212 0
## E01033653 0
## E01033654 0
## E01033655 0
## E01033656 0
## E01033658 0
## E01033659 0
## E01033661 1
## E01033662 0
## E01033664 0
## E01033667 0
## E01033672 0
## E01033673 0
## E01033674 0
## E01033677 0
## E01033681 0
## E01033682 0
## E01033683 0
## E01033684 1
## E01033688 0
## attr(,"call")
## nb2mat(neighbours = w, style = "B")
This matrix has values of 0 or 1 indicating whether the elements listed in the rows are adjacent (using our definition, which in this case was the Queen criteria) with each other. The diagonal is full of zeroes. An area cannot be a neighbour of itself. So, if you look at the first two and the second column you see a 1. That means that the LSOA with the code E01005065 is a neighbour of the second LSOA (as listed in the rows) which is E01005066. You will zeroes for many of the other columns because this LSOA only has 4 neighbours.
7.4.1 Homework 2
Looking at this matrix identify the other 3 neigbours of E01005065
In many computations we will see that the matrix is row standardised. We can obtain a row standardise matrix changing the code:
wm_rs <- nb2mat(w, style='W')
wm_rs
## [,1] [,2] [,3] [,4] [,5] [,6]
## E01005065 0.0000000 0.2500000 0.0000000 0.0000000 0.0000000 0.2500000
## E01005066 0.2000000 0.0000000 0.2000000 0.2000000 0.0000000 0.2000000
## E01005128 0.0000000 0.2000000 0.0000000 0.0000000 0.2000000 0.2000000
## E01005212 0.0000000 0.3333333 0.0000000 0.0000000 0.0000000 0.0000000
## E01033653 0.0000000 0.0000000 0.1428571 0.0000000 0.0000000 0.1428571
## E01033654 0.1666667 0.1666667 0.1666667 0.0000000 0.1666667 0.0000000
## E01033655 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033656 0.0000000 0.1666667 0.1666667 0.1666667 0.0000000 0.0000000
## E01033658 0.0000000 0.0000000 0.0000000 0.0000000 0.1250000 0.0000000
## E01033659 0.0000000 0.0000000 0.0000000 0.0000000 0.2500000 0.2500000
## E01033661 0.2000000 0.0000000 0.0000000 0.0000000 0.0000000 0.2000000
## E01033662 0.0000000 0.0000000 0.2500000 0.0000000 0.2500000 0.0000000
## E01033664 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033667 0.0000000 0.0000000 0.0000000 0.0000000 0.2500000 0.0000000
## E01033672 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033673 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033674 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033677 0.0000000 0.0000000 0.0000000 0.0000000 0.1666667 0.0000000
## E01033681 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033682 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033683 0.0000000 0.0000000 0.0000000 0.2000000 0.0000000 0.0000000
## E01033684 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033688 0.3333333 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [,7] [,8] [,9] [,10] [,11] [,12]
## E01005065 0.0000000 0.0000000 0.0000000 0.0000000 0.2500000 0.0000000
## E01005066 0.0000000 0.2000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01005128 0.0000000 0.2000000 0.0000000 0.0000000 0.0000000 0.2000000
## E01005212 0.0000000 0.3333333 0.0000000 0.0000000 0.0000000 0.0000000
## E01033653 0.0000000 0.0000000 0.1428571 0.1428571 0.0000000 0.1428571
## E01033654 0.0000000 0.0000000 0.0000000 0.1666667 0.1666667 0.0000000
## E01033655 0.0000000 0.0000000 0.3333333 0.0000000 0.0000000 0.0000000
## E01033656 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.1666667
## E01033658 0.1250000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033659 0.0000000 0.0000000 0.0000000 0.0000000 0.2500000 0.0000000
## E01033661 0.0000000 0.0000000 0.0000000 0.2000000 0.0000000 0.0000000
## E01033662 0.0000000 0.2500000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033664 0.3333333 0.0000000 0.3333333 0.0000000 0.0000000 0.0000000
## E01033667 0.0000000 0.0000000 0.2500000 0.2500000 0.0000000 0.0000000
## E01033672 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033673 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033674 0.5000000 0.0000000 0.5000000 0.0000000 0.0000000 0.0000000
## E01033677 0.0000000 0.1666667 0.1666667 0.0000000 0.0000000 0.1666667
## E01033681 0.0000000 0.0000000 0.2500000 0.0000000 0.0000000 0.0000000
## E01033682 0.0000000 0.0000000 0.2000000 0.0000000 0.0000000 0.0000000
## E01033683 0.0000000 0.2000000 0.0000000 0.0000000 0.0000000 0.0000000
## E01033684 0.0000000 0.0000000 0.0000000 0.0000000 0.5000000 0.0000000
## E01033688 0.0000000 0.0000000 0.0000000 0.0000000 0.3333333 0.0000000
## [,13] [,14] [,15] [,16] [,17] [,18] [,19]
## E01005065 0.0000000 0.0000000 0.00 0.00 0.0000000 0.0000000 0.000
## E01005066 0.0000000 0.0000000 0.00 0.00 0.0000000 0.0000000 0.000
## E01005128 0.0000000 0.0000000 0.00 0.00 0.0000000 0.0000000 0.000
## E01005212 0.0000000 0.0000000 0.00 0.00 0.0000000 0.0000000 0.000
## E01033653 0.0000000 0.1428571 0.00 0.00 0.0000000 0.1428571 0.000
## E01033654 0.0000000 0.0000000 0.00 0.00 0.0000000 0.0000000 0.000
## E01033655 0.3333333 0.0000000 0.00 0.00 0.3333333 0.0000000 0.000
## E01033656 0.0000000 0.0000000 0.00 0.00 0.0000000 0.1666667 0.000
## E01033658 0.1250000 0.1250000 0.00 0.00 0.1250000 0.1250000 0.125
## E01033659 0.0000000 0.2500000 0.00 0.00 0.0000000 0.0000000 0.000
## E01033661 0.0000000 0.0000000 0.00 0.00 0.0000000 0.0000000 0.000
## E01033662 0.0000000 0.0000000 0.00 0.00 0.0000000 0.2500000 0.000
## E01033664 0.0000000 0.3333333 0.00 0.00 0.0000000 0.0000000 0.000
## E01033667 0.2500000 0.0000000 0.00 0.00 0.0000000 0.0000000 0.000
## E01033672 0.0000000 0.0000000 0.00 0.25 0.0000000 0.0000000 0.250
## E01033673 0.0000000 0.0000000 0.50 0.00 0.0000000 0.0000000 0.500
## E01033674 0.0000000 0.0000000 0.00 0.00 0.0000000 0.0000000 0.000
## E01033677 0.0000000 0.0000000 0.00 0.00 0.0000000 0.0000000 0.000
## E01033681 0.0000000 0.0000000 0.25 0.25 0.0000000 0.0000000 0.000
## E01033682 0.0000000 0.0000000 0.20 0.00 0.0000000 0.2000000 0.200
## E01033683 0.0000000 0.0000000 0.20 0.00 0.0000000 0.2000000 0.000
## E01033684 0.0000000 0.0000000 0.00 0.00 0.0000000 0.0000000 0.000
## E01033688 0.0000000 0.0000000 0.00 0.00 0.0000000 0.0000000 0.000
## [,20] [,21] [,22] [,23]
## E01005065 0.0000000 0.0000000 0.0000000 0.25
## E01005066 0.0000000 0.0000000 0.0000000 0.00
## E01005128 0.0000000 0.0000000 0.0000000 0.00
## E01005212 0.0000000 0.3333333 0.0000000 0.00
## E01033653 0.0000000 0.0000000 0.0000000 0.00
## E01033654 0.0000000 0.0000000 0.0000000 0.00
## E01033655 0.0000000 0.0000000 0.0000000 0.00
## E01033656 0.0000000 0.1666667 0.0000000 0.00
## E01033658 0.1250000 0.0000000 0.0000000 0.00
## E01033659 0.0000000 0.0000000 0.0000000 0.00
## E01033661 0.0000000 0.0000000 0.2000000 0.20
## E01033662 0.0000000 0.0000000 0.0000000 0.00
## E01033664 0.0000000 0.0000000 0.0000000 0.00
## E01033667 0.0000000 0.0000000 0.0000000 0.00
## E01033672 0.2500000 0.2500000 0.0000000 0.00
## E01033673 0.0000000 0.0000000 0.0000000 0.00
## E01033674 0.0000000 0.0000000 0.0000000 0.00
## E01033677 0.1666667 0.1666667 0.0000000 0.00
## E01033681 0.2500000 0.0000000 0.0000000 0.00
## E01033682 0.0000000 0.2000000 0.0000000 0.00
## E01033683 0.2000000 0.0000000 0.0000000 0.00
## E01033684 0.0000000 0.0000000 0.0000000 0.50
## E01033688 0.0000000 0.0000000 0.3333333 0.00
## attr(,"call")
## nb2mat(neighbours = w, style = "W")
Row standardisation of a matrix ensure that the sum of the rows adds up to 1. So, for example, if you have four neighbours and that has to add up to 4, you need to divide 1 by 4, which gives you 0.25. So in the columns for a polygon with 4 neighbours you will see 0.25 in the column representing each of the neighbours.
7.5 Compute Moran’s I
The most well known measure of spatial autocorrelation is the Moran’s I. It was developed by Patrick Alfred Pierce Moran, an Australian statistician. You can find the formula and some explanation in the wikepedia article. The video lecture by Luc Anselin covers an explanation of Moran’s I. We strongly recommend you watch the video. You can also find helpful this link if things are still unclear. The formula you see may look intimidating but it is nothing but the formula for standard correlation expanded to incorporate the spatial weight matrix.
Before we can use the functions from spdep to compute the global Moran’s I we need to create a ‘listw’ type spatial weights object (instead of the matrix we used above). To get the same value as above we use “style=’B’” to use binary (TRUE/FALSE) distance weights.
(try row standardising)
ww <- nb2listw(w, style='B')
ww
## Characteristics of weights list object:
## Neighbour list object:
## Number of regions: 23
## Number of nonzero links: 100
## Percentage nonzero weights: 18.90359
## Average number of links: 4.347826
##
## Weights style: B
## Weights constants summary:
## n nn S0 S1 S2
## B 23 529 100 200 1960
Now we can use the moran function. Have a look at ?moran. The function is defined as ‘moran(y, ww, n, Szero(ww))’. Note the odd arguments n and S0. I think they are odd, because “ww” has that information. Anyway, we supply them and it works. There probably are cases where it makes sense to use other values.
moran(bur_ccsp$burglary, ww, n=length(ww$neighbours), S0=Szero(ww))
## $I
## [1] 0.1203294
##
## $K
## [1] 7.964184
So the Moran’s I here is 0.12, which is not very large. But still. We can now run a test of statistical significance for this statistic.
The Spatial Autocorrelation (Global Moran’s I) tool is an inferential statistic, which means that the results of the analysis are always interpreted within the context of its null hypothesis. For the Global Moran’s I statistic, the null hypothesis states that the attribute being analyzed is randomly distributed among the features in your study area; said another way, the spatial processes promoting the observed pattern of values is random chance. Imagine that you could pick up the values for the attribute you are analyzing and throw them down onto your features, letting each value fall where it may. This process (picking up and throwing down the values) is an example of a random chance spatial process. When the p-value returned by this tool is statistically significant, you can reject the null hypothesis.
In some software you can use statistical tests invoking asymptotic theory, but the only appropriate way of doing these tests is by using a Monte Carlo procedure. The way Monte Carlo works is that the values of burglary are randomly assigned to the polygons, and the Moran’s I is computed. This is repeated several times to establish a distribution of expected values. The observed value of Moran’s I is then compared with the simulated distribution to see how likely it is that the observed values could be considered a random draw.
If confused, watch this quick video on monte carlo simulations.
We use the function moran.mc() to run a permutation test for Moran’s I statistic calculated by using some number of random permutations of our numeric vector, for the given spatial weighting scheme, to establish the rank of the observed statistic in relation to the simulated values.
We need to specify our variable of interest (burglary), the listw object we created earlier (ww), and the number of permutations we want to run (here we choose 99).
set.seed(1234) # The seed number you choose is the starting point used in the generation of a sequence of random numbers, which is why (provided you use the same pseudo-random number generator) you'll obtain the same results given the same seed number.
burg_moranmc_results <- moran.mc(bur_ccsp$burglary, ww, nsim=99)
burg_moranmc_results
##
## Monte-Carlo simulation of Moran I
##
## data: bur_ccsp$burglary
## weights: ww
## number of simulations + 1: 100
##
## statistic = 0.12033, observed rank = 93, p-value = 0.07
## alternative hypothesis: greater
So, the probability of observing this Moran’s I if the null hypothis was true is 0.07. This is higher than our alpha level of 0.05. In this case, we can conclude that there isn’t a significant global spatial autocorrelation.
We can make a “Moran scatter plot” to visualize spatial autocorrelation. Note the row standardisation of the weights matrix.
rwm <- mat2listw(wm, style='W')
# Checking if rows add up to 1 (they should)
mat <- listw2mat(rwm)
#This code is simply adding each row to see if we get one when we add their values up, we are only displaying the first 15 rows in the matrix
apply(mat, 1, sum)[1:15]
## E01005065 E01005066 E01005128 E01005212 E01033653 E01033654 E01033655
## 1 1 1 1 1 1 1
## E01033656 E01033658 E01033659 E01033661 E01033662 E01033664 E01033667
## 1 1 1 1 1 1 1
## E01033672
## 1
Now we can plot:
moran.plot(bur_ccsp$burglary, rwm)

The X axis represents the values of our burglary variable in each unit (each LSOA) and the Y axis represents a spatial lag of this variable. A spatial lag in this context is simply the average value of the burglary count in the areas that are considered neighours of each LSOA. So we are plotting the value of burglary against the average value of burglary in the neighbours. And you can see the correlation is almost flat here. As with any correlation measure, you could get positive spatial autocorrelation, that would mean that as you move further to the right in the X axis you have higher levels of burglary in the surrounding area. This is what we see here. But the correlation is fairly low and as we saw is not statistically significant. You can also obtain negative spatial autocorrelation. That is, that would mean that areas with high level of crime tend (it’s all about the global average effect!) to be surrounded by areas with low levels of crime. This is clearly not what we see here.
It is very important to understand that global statistics like the spatial autocorrelation (Global Moran’s I) tool assess the overall pattern and trend of your data. They are most effective when the spatial pattern is consistent across the study area. In other words, you may have clusters (local pockets of autocorrelation), without having clustering (global autocorrelation). This may happen if the sign of the clusters negate each other.
But don’t just take our word for how important this is, or how it’s commonly applied in criminological research. Instead, now that you’ve gone through on how to do this, and have begun to get a sense of understanding, read the following paper on https://link.springer.com/article/10.1023/A:1007592124641 where the authors make use of Moran’s I to explain spatial characteristics of homicide. You will likely see this in other papers as well, and now you will know what it means and why it’s important.
7.5.1 Homework 3 (Optional Practice)
You know what is coming, don’t you? Yes, you need to compute the Moran’s I for burglary again. But this time, you need to do it for the whole of Manchester city. I won’t mark you down if you don’t try this. But why wouldn’t you?
7.6 Week 7 : Local spatial autocorrelation
7.7 Prelude
Last week we learned about global measures of spatial association, in particular the Moran’s I statistic, which provide a mechanism to make inferences about a population from a sample. While this is useful for us to be able to assess quantitatively whether crime events cluster in a non-random manner, in the words of Jerry Ratcliffe “this simply explains what most criminal justice students learn in their earliest classes.” For example, consider the study of robberies in Philadelphia:

Aggregated to the police districts, this returns a global Moran’s I value (range 1 to 1) of 0.56, which suggests that police sectors with high robbery counts adjoin sectors that also have high robbery counts, and low crime sectors are often neighbors of other low crime sectors, something that should hardly be surprising given the above map (Ratcliffe, Jerry. “Crime mapping: spatial and temporal challenges.” Handbook of quantitative criminology. Springer, New York, NY, 2010. 5-24.).
In today’s session we will learn about local indicators of spatial association (LISA) and show how they allow for the decomposition of global indicators, such as Moran’s I, into the contribution of each observation. The LISA statistics serve two purposes. On one hand, they may be interpreted as indicators of local pockets of nonstationarity, or hot spots. On the other hand, they may be used to assess the influence of individual locations on the magnitude of the global statistic and to identify “outliers” (Anselin, Luc. “Local indicators of spatial association—LISA.” Geographical analysis 27.2 (1995): 93-115.).
These are the packages we will use:
library(tmap)
library(dplyr)
library(sp)
library(sf)
library(spdep)
7.8 Get data and built the weight matrix
For this week we will be going back to the data from last week about burglary in Manchester city. Below is the code to build this. This is code we have already used and covered in previous sessions. If any of this is confusing, or you need a reminder just refer to last week’s notes!
Get the boundary data:
shp_name <- "data/BoundaryData/england_lsoa_2011.shp"
manchester_lsoa <- st_read(shp_name)
## Reading layer `england_lsoa_2011' from data source `/Users/reka/Dropbox (The University of Manchester)/crimemapping_textbook_bookdown/data/BoundaryData/england_lsoa_2011.shp' using driver `ESRI Shapefile'
## Simple feature collection with 282 features and 3 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: 378833.2 ymin: 382620.6 xmax: 390350.2 ymax: 405357.1
## epsg (SRID): NA
## proj4string: +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +datum=OSGB36 +units=m +no_defs
Transform the CRS to WGS84:
lsoa_WGS84 <- st_transform(manchester_lsoa, 4326)
Read the crimes data into R:
crimes <- read.csv("https://raw.githubusercontent.com/jjmedinaariza/CrimeMapping/master/gmpcrime.csv")
Subset the data:Filter out to select burglary:
burglary <- filter(crimes, crime_type == "Burglary")
Of course this is just a dataframe, and not (yet) a spatial object. To map, we must first transform into spatial object:
burglary_spatial <- st_as_sf(burglary, coords = c("long", "lat"),
crs = 4326, #set the CRS to WGS84
agr = "constant") #the 'agr' attribute-geometry-relationship, specifies for each non-geometry attribute column how it relates to the geometry. "constant" is used for attributes that are constant throughout the geometry (e.g. land use)
Now we want to select burglaries inside Manchester City. We can use st_intersects() to select those that intersect with the Manchester city LSOA map.
#intersets
bur_m <- st_intersects(lsoa_WGS84, burglary_spatial)
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
#subsetting
bur_m <- burglary_spatial[unlist(bur_m),]
Count the number of crimes in each LSOA by using the point in polygon spatial operation.
(remember this might take a while, it’s checking every single crime point, which you can see is over 400,000 points!)
burglaries_per_lsoa <- bur_m %>%
st_join(lsoa_WGS84, ., left = FALSE) %>%
count(code)
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
You can see the column with the count of burglaries is called ‘n’. (You can View() your data or use the names() function to check what your variables are called). This is not great variable naming, so let’s rename n into something more meaningful:
burglaries_per_lsoa <- rename(burglaries_per_lsoa, burglary = n)
Now that’s all our data wrangling done. Woo!
Let’s remove all the dataframes and objects we’ve read in but no longer need, now that we have arrived at our final dataset, burglaries_per_lsoa.
#Remove redundant objects
rm(burglary_spatial)
rm(manchester_lsoa)
#Remove redundant non spatial burglary objects
rm(burglary)
rm(crimes)
And finally, let’s see our map! Plot with tmap:
tm_shape(burglaries_per_lsoa) +
tm_fill("burglary", style = "quantile", palette = "Reds") +
tm_borders(alpha = 0.1) +
tm_layout(main.title = "Burglary counts", main.title.size = 0.7 ,
legend.position = c("right", "bottom"), legend.title.size = 0.8)
## legend.postion is used for plot mode. Use view.legend.position in tm_view to set the legend position in view mode.
Now that we have the data we need to coerce our newly created sf object into sp and then generate the wieght matrix. Again, what we do here is stuff we cover last week. In fact, if you did the optional homework you will also have run this code.
#Coerce sf into sp
burglary_m <- as(burglaries_per_lsoa, "Spatial")
#Generate list of neighbours using the Queen criteria
w <- poly2nb(burglary_m, row.names=burglary_m$lsoa_code)
#Generate list with weights using row standardisation
ww <- nb2listw(w, style='W')
7.9 Generating and visualising the LISA measures
The first step before we can generate the LISA map is to compute the local Moran’s I. The initial part of the video presentation by Luc Anselin that we expected you to watch explains the formula and logic underpinning these computations and we won’t reiterate here that detail. But at least a a general reminder:
“global tests for spatial autocorrelation introduced last week are calculated from the local relationship between the values observed at a spatial entity and its neighbours, for the neighbourhood definition chosen. Because of this, we can break global measures down into their components, and by extension, construct localised tests intended to detect ‘clusters’ -observations with very similar neighbours- and” spatial outliers “observations with very different neighbours” (Bivand et al. 201)
Let’s first look at the Moran’s scatterplot:
moran.plot(burglary_m$burglary, ww)

Notice how the plot is split in 4 quadrants. The top right corner belongs to areas that have high level of burglary and are surrounded by other areas that have above the average level of burglary. This are the high-high locations that Luc Anselin referred to. The bottom left corner belongs to the low-low areas. These are areas with low level of burglary and surrounded by areas with below average levels of burglary. Both the high-high and low-low represent clusters. A high-high cluster is what you may refer to as a hot spot. And the low-low clusters represent cold spots. In the opposite diagonal we have spatial outliers. They are not outliers in the standard sense, extreme observations, they are outliers in that they are surrounded by areas that are very unlike them. So you could have high-low spatial outliers, areas with high levels of burglary and low levels of surrounding burglary, or low-high spatial outliers, areas that have themselves low levels of burglary (or whatever else we are mapping) and that are surrounded by areas with above average levels of burglary.
You can also see here that the positive spatial autocorrelation is more noticeable when we focus on the whole of Manchester city, unlike what we observed when only looked at Manchester city. You can check this running the global Moran’s I.
moran(burglary_m$burglary, ww, n=length(ww$neighbours), S0=Szero(ww))
## $I
## [1] 0.319477
##
## $K
## [1] 33.59689
moran.mc(burglary_m$burglary, ww, nsim=99999)
##
## Monte-Carlo simulation of Moran I
##
## data: burglary_m$burglary
## weights: ww
## number of simulations + 1: 1e+05
##
## statistic = 0.31948, observed rank = 1e+05, p-value = 1e-05
## alternative hypothesis: greater
You can see that the global Moran’s is 0.32 and that is highly significant. There is indeed global spatial autocorrelation. Knowing this we can try to decompose this, figure out what is driving this global measure.
To compute the local Moran we can use a function from the spdep package.
locm_bm <- localmoran(burglary_m$burglary, ww)
summary(locm_bm)
## Ii E.Ii Var.Ii
## Min. :-1.933870 Min. :-0.003559 Min. :0.07787
## 1st Qu.:-0.004654 1st Qu.:-0.003559 1st Qu.:0.14505
## Median : 0.123084 Median :-0.003559 Median :0.17460
## Mean : 0.319477 Mean :-0.003559 Mean :0.17805
## 3rd Qu.: 0.314142 3rd Qu.:-0.003559 3rd Qu.:0.21894
## Max. :13.201230 Max. :-0.003559 Max. :0.44062
## Z.Ii Pr(z > 0)
## Min. :-4.12540 Min. :0.0000
## 1st Qu.:-0.00256 1st Qu.:0.2202
## Median : 0.29992 Median :0.3821
## Mean : 0.84062 Mean :0.3707
## 3rd Qu.: 0.77152 3rd Qu.:0.5010
## Max. :37.50929 Max. :1.0000
The first column provides the summary statistic for the local moran statistic. Being local you will have one for each of the areas. The last column gives you a p value for this statistic. In order to produce the LISA map we need to do some previous work. First we are going to create some new variables that we are going to need:
First we scale the variable of interest. When we scale burglary what we are doing is re-scaling the values so that the mean is zero. See an explanation of what this does here.
We use scale(), which is a generic function whose default method centers and/or scales the variable. Here centering is done by subtracting the mean (omitting NAs) the corresponding columns, and scaling is done by dividing the (centered) variable by their standard deviations:
#scale the variable of interest and save it to a new column
burglary_m$S_burglary <- scale(burglary_m$burglary) %>% as.vector()
We’ve also added as.vector() to the end, because
Now we also want to account for the spatial dependence of our values. Remember how we keep saying about “The First Law of Geography”, according to Waldo Tobler, is “everything is related to everything else, but near things are more related than distant things.”
So what do we mean by this spatial dependence? When a value observed in one location depends on the values observed at neighboring locations, there is a spatial dependence. And spatial data may show spatial dependence in the variables and error terms.
Why should spatial dependence occur? There are two reasons commonly given. First, data collection of observations associated with spatial units may reflect measurement error. This happens when the boundaries for which information is collected do not accurately reflect the nature of the underlying process generating the sample data. A second reason for spatial dependence is that the spatial dimension of a social or economic characteristic may be an important aspect of the phenomenon. For example, based on the premise that location and distance are important forces at work, regional science theory relies on notions of spatial interaction and diffusion effects, hierarchies of place and spatial spillovers.
There are two types of dependence, spatial error and spatial lag. Here we focus on spatial lag.
Spatial lag is when the dependent variable y in place i is affected by the independent variables in both place i and j.

This will be important to keep in mind when considering spatial regression. With spatial lag in ordinary least square regression, the assumption of uncorrelated error terms is violated, becasuse near things will have associated error terms. Similarly, the assumption of independent observations is also violated, as the observations are influenced by the other observations near them. As a result, the estimates are biased and inefficient. Spatial lag is suggestive of a possible diffusion process – events in one place predict an increased likelihood of similar events in neighboring places.
So to be able to account for the spatial lag in our model, we must create a variable to account for this. We can do this with the lag.listw() function. Remember from last week: a spatial lag in this context is simply the average value of the burglary count in the areas that are considered neighours of each LSOA. So we are plotting the value of burglary against the average value of burglary in the neighbours.
For this we need our listw object, which is the ww object created earlier, when we generated the list with weights using row standardisation. We then pass this listw object into the lag.listw() function, which computes the spatial lag of a numeric vector using a listw sparse representation of a spatial weights matrix.
#create a spatial lag variable and save it to a new column
burglary_m$lag_s_burglary <- lag.listw(ww, burglary_m$S_burglary)
Make sure to check the summaries to ensure nothing weird is going on
summary(burglary_m$S_burglary)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.2149 -0.6048 -0.1908 0.0000 0.2667 9.0256
summary(burglary_m$lag_s_burglary)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.99700 -0.44086 -0.07464 0.06068 0.34738 3.38679
We can create a Moran scatter plot so that you see that nothing has changed apart from the scale in wich we are using the variables. The observations that are influential are higlighted in the plot as you can see.
x <- burglary_m$S_burglary
y <- burglary_m$lag_s_burglary
xx <- data_frame(x,y)
moran.plot(x, ww)

We are now going to create a new variable to identify the quadrant in which each observation falls within the Moran Scatter plot, so that we can tell apart the high-high, low-low, high-low, and low-high areas. We will only identify those that are significant according to the p value that was provided by the local moran function.
Before we get started, let’s quickly review the tools we will use.
All our data is in this burglary_m dataframe. This has a variable for the LSOA code (code), a variable for the number of burglaries (burglary), and then also the two variables we created, the scaled measure of burglary (S_burglary), and the spatial lag measure (lag_s_burglary).
We also have our locm_bm object, which we created with the localmoran() function, that has calculated a variety of measures for each of our observations, which we explored with the summary() function. You can see (if you scroll up) that the 5th element in this object is the p-value (“Pr(z > 0)”). To call the nth element of an object, you can use the square brackets after its name. So to return the nth column of thing, you can use thing[,n]. Again this should not be new to you, as we’ve been doing this sort of thing for a while.
So the data we need for each observation, in order to identify whether it belongs to the high-high, low-low, high-low, or low-high quadrants are the standardised burglary score, the spatial lag score, and the p-value.
Essentially all we’ll be doing, is assigning a variable values based on where in the plot it is. So for example, if it’s in the upper right, it is high-high, and has values larger than 0 for both the burglary and the spatial lag values. It it’s in the upper left, it’s low-hih, and has a value larger than 0 for the spatial lag value, but lower than 0 on the burglary value. And so on, and so on. Here’s an image to illustrate:

So let’s first initialise this variable. In this instance we are creating a new column in the burglary_m dataframe and calling it quad_sig.
We are using the mutate() function from the tidyverse package to create our new variable, just as we have in previous labs.
We also use nested ifelse() statements. Nested ifelse() just means that it’s an ifelse() inside and ifelse() statement. To help us with these sorts of situations is the ifelse() function. We saw this with the previous exercises, but I’ll describe it brielfy again. It allows us to conditionally assign some value to some variable. The structure of the function is so that you have to pass it a condition, then a value to assign if the condition is true, and then another value if the condition is false. You are basically using this function to say: “if this condition is true, do first thing, else, do the second thing”. It would look something like this:
dataframe$new_variable <- ifelse(dataframe$some_numeric_var < 100, "smaller than 100", "not smaller than 100")
When nesting these, all you do is put another condition to check in the “thing to do if false”, so it checks all conditions. So in the first instance we check if the value for burglary is greater than zero, and the value for the lag is greater than zero, and the p-value is smaller than our threshol of 0.05. If it is, then this should belong to the “high-high” group. If any one of these conditions is not met, then we move into the ‘thing to do if false’ section, where we now check agains another set of criteria, and so on and so on. If none of these are met, we assign it the non-significant value:
burglary_m <- st_as_sf(burglary_m) %>%
mutate(quad_sig = ifelse(burglary_m$S_burglary > 0 &
burglary_m$lag_s_burglary > 0 &
locm_bm[,5] <= 0.05,
"high-high",
ifelse(burglary_m$S_burglary <= 0 &
burglary_m$lag_s_burglary <= 0 &
locm_bm[,5] <= 0.05,
"low-low",
ifelse(burglary_m$S_burglary > 0 &
burglary_m$lag_s_burglary <= 0 &
locm_bm[,5] <= 0.05,
"high-low",
ifelse(burglary_m$S_burglary <= 0 &
burglary_m$lag_s_burglary > 0 &
locm_bm[,5] <= 0.05,
"low-high",
"non-significant")))))
(Note we had to wrap our data in a st_as_sf() function, to covert back to sf object).
Now we can have a look at what this returns us:
table(burglary_m$quad_sig)
##
## high-high low-low non-significant
## 22 1 259
This looks like a lot of non-significant results. We want to be sure this isn’t an artefact of our code but is true, we can check how many values are under 0.05:
nrow(locm_bm[locm_bm[,5] <= 0.05,])
## [1] 23
We can see that only 23 areas have p-values under 0.05 threshold. So this is in line with our results, and we can rest assured.
Well, this is exciting, but where are these regions?
Let’s put ’em on a map, just simply, using quick thematic map (qtm()):
qtm(burglary_m, fill="quad_sig", fill.title="LISA")
Very nice!

So how do we interpret these results? Well keep in mind:
- The LISA value for each location is determined from its individual contribution to the global Moran’s I calculation.
- Whether or not this value is statistically significant is assessed by comparing the actual value to the value calculated for the same location by randomly reassigning the data among all the areal units and recalculating the values each time (the Monte Carlo simulation approach discussed earlier).
So essentially this map now tells us that there was statistically significant moderate clustering in burglaries in Manchester. When reporting your results, report at least the Moran’s I test value and the p value. So, for this test, you should report Moran’s I = 0.32, p < .001. Including the LISA cluster map is also a great way of showing how the attribute is actually clustering.